import pandas as pd
import plotly.graph_objects as gp
import numpy as np
import matplotlib.pyplot as plt
tit=pd.read_csv("allTitanic.csv")
tit.head()
| | Unnamed: 0 | Name | Age | Boarded | Position | Lifeboat | Body | Sex | Class | Group | Survived | ID | Adult |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | Mr. George Swane | 19.0 | S | chauffeur | NaN | 294MB | m | 2 | 1classServ | False | 1 | True |
| 1 | 2 | Miss Amelia Mary "Mildred" Brown | 18.0 | S | cook | 11 | NaN | f | 2 | 1classServ | True | 2 | True |
| 2 | 3 | Miss Sarah Daniels | 33.0 | S | maid | 8 | NaN | f | 1 | 1classServ | True | 3 | True |
| 3 | 4 | Miss Alice Catherine Cleaver | 22.0 | S | nurse | 11 | NaN | f | 1 | 1classServ | True | 4 | True |
| 4 | 5 | Miss Rosalie Bidois | 46.0 | C | maid | 4 | NaN | f | 1 | 1classServ | True | 5 | True |
We plot the people saved from the Titanic, broken down by group.
The data come from two Wikipedia pages:
https://en.wikipedia.org/wiki/Passengers_of_the_Titanic
https://en.wikipedia.org/wiki/Crew_of_the_Titanic
We processed both together in earlier work, kept in an ac-uoc repository on GitHub,
and combined them into the single file allTitanic.csv.
titg=tit.groupby('Group').count()
titg.head()
| Group | Unnamed: 0 | Name | Age | Boarded | Position | Lifeboat | Body | Sex | Class | Survived | ID | Adult |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1c0ser | 250 | 250 | 250 | 250 | 250 | 144 | 30 | 250 | 250 | 250 | 250 | 250 |
| 1c1serv | 28 | 28 | 28 | 28 | 28 | 22 | 2 | 28 | 28 | 28 | 28 | 28 |
| 1c2serv | 6 | 6 | 6 | 6 | 6 | 4 | 1 | 6 | 6 | 6 | 6 | 6 |
| 1classServ | 41 | 41 | 41 | 41 | 41 | 29 | 4 | 41 | 41 | 41 | 41 | 41 |
| 2class | 278 | 278 | 278 | 278 | 278 | 118 | 34 | 278 | 278 | 278 | 278 | 278 |
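We compute the percentage saved in each group. Only the people in the lifeboats were saved, so the count of non-null Lifeboat values divided by the group size (the Name count) gives that percentage.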
titg['%salvats']=100*titg.Lifeboat/titg.Name
titg.head()
| Group | Unnamed: 0 | Name | Age | Boarded | Position | Lifeboat | Body | Sex | Class | Survived | ID | Adult | %salvats |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1c0ser | 250 | 250 | 250 | 250 | 250 | 144 | 30 | 250 | 250 | 250 | 250 | 250 | 57.600000 |
| 1c1serv | 28 | 28 | 28 | 28 | 28 | 22 | 2 | 28 | 28 | 28 | 28 | 28 | 78.571429 |
| 1c2serv | 6 | 6 | 6 | 6 | 6 | 4 | 1 | 6 | 6 | 6 | 6 | 6 | 66.666667 |
| 1classServ | 41 | 41 | 41 | 41 | 41 | 29 | 4 | 41 | 41 | 41 | 41 | 41 | 70.731707 |
| 2class | 278 | 278 | 278 | 278 | 278 | 118 | 34 | 278 | 278 | 278 | 278 | 278 | 42.446043 |
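The calculation above relies on `count()` skipping missing values. A minimal sketch of that behaviour follows, using a toy table invented for illustration (it is not taken from allTitanic.csv); the variable names `toy`, `pct_from_counts` and `pct_from_mean` are assumptions. It also shows an equivalent route through the mean of a boolean flag, which avoids counting every column.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Name": ["a", "b", "c", "d", "e"],
    "Group": ["Deck", "Deck", "Deck", "Postman", "Postman"],
    "Lifeboat": ["4", None, "11", None, None],
})

# count() ignores missing values, so the Lifeboat column counts
# only the people who have a lifeboat recorded.
counts = toy.groupby("Group").count()
pct_from_counts = 100 * counts["Lifeboat"] / counts["Name"]

# Equivalent route: the per-group mean of a "has a lifeboat" flag.
pct_from_mean = 100 * toy["Lifeboat"].notna().groupby(toy["Group"]).mean()

print(pct_from_counts)
print(pct_from_mean)
```

Both routes should agree as long as Name has no missing values, since Name is used here as the group size.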
We sort the groups by % saved, in ascending order.
df=titg.sort_values(by='%salvats',ascending=True)
df.head()
| Group | Unnamed: 0 | Name | Age | Boarded | Position | Lifeboat | Body | Sex | Class | Survived | ID | Adult | %salvats |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Orchestra | 8 | 8 | 8 | 8 | 8 | 0 | 3 | 8 | 8 | 8 | 8 | 8 | 0.000000 |
| Postman | 5 | 5 | 5 | 5 | 5 | 0 | 2 | 5 | 5 | 5 | 5 | 5 | 0.000000 |
| garantee | 9 | 9 | 9 | 9 | 9 | 0 | 0 | 9 | 9 | 9 | 9 | 9 | 0.000000 |
| restaurant | 69 | 69 | 69 | 69 | 69 | 3 | 13 | 69 | 69 | 69 | 69 | 69 | 4.347826 |
| Victualling | 389 | 389 | 389 | 389 | 389 | 81 | 79 | 389 | 389 | 389 | 389 | 389 | 20.822622 |
salvats=pd.Series(df['%salvats'])
salvats
| Group | %salvats |
|---|---|
| Orchestra | 0.000000 |
| Postman | 0.000000 |
| garantee | 0.000000 |
| restaurant | 4.347826 |
| Victualling | 20.822622 |
| Engineering | 21.316614 |
| 3class | 24.435028 |
| 2class | 42.446043 |
| officer | 50.000000 |
| 1c0ser | 57.600000 |
| 1c2serv | 66.666667 |
| Deck | 69.491525 |
| 1classServ | 70.731707 |
| 1c1serv | 78.571429 |

Name: %salvats, dtype: float64
We repeat the process restricted to men, Sex=='m'.
titm=tit[tit.Sex=='m']
titmg=titm.groupby('Group').count()
titmg['%salvats']=100*titmg.Lifeboat/titmg.Name
dfm=titmg.sort_values(by='%salvats',ascending=True)
salvatsm=pd.Series(dfm['%salvats'])
salvatsm
| Group | %salvats |
|---|---|
| Orchestra | 0.000000 |
| Postman | 0.000000 |
| garantee | 0.000000 |
| restaurant | 1.492537 |
| 1classServ | 14.285714 |
| 2class | 15.168539 |
| 3class | 15.705765 |
| Victualling | 18.157182 |
| Engineering | 21.316614 |
| 1c0ser | 34.838710 |
| 1c2serv | 50.000000 |
| officer | 50.000000 |
| 1c1serv | 60.000000 |
| Deck | 69.491525 |

Name: %salvats, dtype: float64
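Since the same group, count, divide and sort steps run twice, they could be wrapped in a small helper. The sketch below is one possible way to do that, not the notebook's own code: the function name `pct_saved_by_group` and the toy table are hypothetical, and only the column names Name, Group, Sex and Lifeboat come from the data above.

```python
import pandas as pd

def pct_saved_by_group(frame):
    """Percentage of rows with a non-null Lifeboat per Group, ascending."""
    counts = frame.groupby("Group").count()
    pct = 100 * counts["Lifeboat"] / counts["Name"]
    return pct.rename("%salvats").sort_values()

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Name": ["a", "b", "c", "d", "e", "f"],
    "Group": ["Deck", "Deck", "Deck", "Deck", "Postman", "Postman"],
    "Sex": ["m", "m", "f", "f", "m", "m"],
    "Lifeboat": ["4", None, "11", "8", None, None],
})

all_pct = pct_saved_by_group(toy)                     # everyone
men_pct = pct_saved_by_group(toy[toy["Sex"] == "m"])  # men only

print(all_pct)
print(men_pct)
```

With the real data, `pct_saved_by_group(tit)` and `pct_saved_by_group(tit[tit.Sex == 'm'])` would be expected to reproduce `salvats` and `salvatsm`, assuming Name is never missing.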
Pyramid chart of the percentage saved per group
fig = gp.Figure()
# Percentage saved over everyone, drawn to the right of zero
fig.add_trace(gp.Bar(y=salvats.index, x=salvats,
                     name='Total saved',
                     orientation='h'))
# Percentage saved among men (Sex == 'm'), negated so it is drawn to the left
fig.add_trace(gp.Bar(y=salvatsm.index, x=-salvatsm,
                     name='Men saved', orientation='h'))
# Chart layout: tick labels show absolute values on both sides of zero
fig.update_layout(title='% saved on the Titanic, by group',
                  title_font_size=22, barmode='relative',
                  bargap=0.0, bargroupgap=0,
                  xaxis=dict(tickvals=[-100, -50, 0, 50, 100],
                             ticktext=['100', '50', '0', '50', '100'],
                             title='% saved',
                             title_font_size=14)
                  )
fig.show()
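The pyramid effect comes from two small data steps: negating one series so its bars grow to the left, and labelling the signed tick positions with their absolute values. The sketch below isolates those steps with pandas only. The numbers are toy values invented for illustration, and the names `total`, `men`, `left` and `right` are assumptions. The `reindex` call is optional and only makes the group pairing explicit, because the two sorted series list the groups in different orders.

```python
import pandas as pd

# Toy percentages, invented for illustration only.
total = pd.Series({"Postman": 0.0, "3class": 24.4, "Deck": 69.5}, name="%salvats")
men = pd.Series({"Postman": 0.0, "Deck": 69.5, "3class": 15.7}, name="%salvats")

# Put the men's values in the same group order as the totals.
men_aligned = men.reindex(total.index)

# Mirror the men's bars to the left of zero; totals stay on the right.
left = -men_aligned
right = total

# Ticks sit at signed positions but are labelled with absolute values.
tickvals = [-100, -50, 0, 50, 100]
ticktext = [str(abs(v)) for v in tickvals]

print(pd.DataFrame({"left": left, "right": right}))
print(ticktext)
```

The `left` and `right` series would then play the role of `-salvatsm` and `salvats` in the two `gp.Bar` traces.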